PlantTFDB
Plant Transcription Factor Database
v4.0
Previous version: v3.0
Transcription Factor Information
Basic Information
TF ID Thecc1EG015114t4
Common Name TCM_015114
Organism
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family HD-ZIP
Protein Properties Length: 527aa    MW: 57923 Da    PI: 6.8236
Description HD-ZIP family protein
Gene Model
Gene Model ID     Type    Source  Coding Sequence
Thecc1EG015114t4  genome  CGD     View CDS
Signature Domain
No.  Domain    Score  E-value  Start  End  HMM Start  HMM End
1    Homeobox  64.7   1.3e-20  54     110  1          57
                       TT--SS--HHHHHHHHHHHHHSSS--HHHHHHHHHHCTS-HHHHHHHHHHHHHHHHC CS
          Homeobox   1 rrkRttftkeqleeLeelFeknrypsaeereeLAkklgLterqVkvWFqNrRakekk 57 
                       ++k +++t++q++eLe++F+++++p++++r eL+++l L+ +q+k+WFqNrR+++k+
  Thecc1EG015114t4  54 KKKYHRHTPHQIQELESFFKECPHPDEKQRLELSRRLALESKQIKFWFQNRRTQMKN 110
                       79999*************************************************995 PP

2    START     133.6  2.2e-42  108    256  69         205
                       GG.CT-TT-S....EEEEEEEECTT......EEEEEEEEXXTTXX-SSX.EEEEEEEEEEE.TTS-EEEEEEEEE-TTS--.-TTSEE-EESS CS
             START  69 ke.qWdetla....kaetlevissg......galqlmvaelqalsplvp.RdfvfvRyirqlgagdwvivdvSvdseqkppesssvvRaellp 149
                       ++ +W e+++    +++t++v+ss+      ++lq+m ae+q+lsplvp R + f+R+++q+++ +w++vdvS+d  q+  + + +  +++lp
  Thecc1EG015114t4 108 MKnRWAEMFPcmisRVATIDVLSSAtgvtrdNTLQVMDAEFQVLSPLVPvRQVRFLRFCKQHTERVWAVVDVSIDASQDAASAQMFPNCRRLP 200
                       3358****************************************************************************9888899****** PP

                       EEEEEEEECTCEEEEEEEE-EE--SSXXHHHHHHHHHHHHHHHHHHHHHHTXXXXX CS
             START 150 Sgiliepksnghskvtwvehvdlkgrlphwllrslvksglaegaktwvatlqrqce 205
                       Sg++i++++n +skvtwveh +++++ +h llr+l+++g  +ga +w+atlqrqc 
  Thecc1EG015114t4 201 SGCVIQDMDNKYSKVTWVEHSEYDDSAVHHLLRPLLSYGFGFGAHRWLATLQRQCD 256
                       ******************************************************96 PP

Protein Features
Database         Entry ID          E-value   Start  End  InterPro ID  Description
SMART            SM00234           5.4E-18   37     257  IPR002913    START domain
Gene3D           G3DSA:1.10.10.60  1.6E-21   40     110  IPR009057    Homeodomain-like
SuperFamily      SSF46689          3.34E-20  40     110  IPR009057    Homeodomain-like
PROSITE profile  PS50071           17.54     51     111  IPR001356    Homeobox domain
SMART            SM00389           2.9E-17   53     115  IPR001356    Homeobox domain
Pfam             PF00046           2.9E-18   54     109  IPR001356    Homeobox domain
CDD              cd00086           5.75E-19  54     112  No hit       No description
PROSITE pattern  PS00027           0         86     109  IPR017970    Homeobox, conserved site
PROSITE profile  PS50848           27.274    101    260  IPR002913    START domain
Pfam             PF01852           3.7E-36   108    256  IPR002913    START domain
CDD              cd08875           8.99E-77  109    255  No hit       No description
SuperFamily      SSF55961          7.0E-21   110    257  No hit       No description
SuperFamily      SSF55961          1.51E-18  285    520  No hit       No description
Gene Ontology
GO Term     GO Category         GO Description
GO:0006355  Biological Process  regulation of transcription, DNA-templated
GO:0005634  Cellular Component  nucleus
GO:0008289  Molecular Function  lipid binding
GO:0043565  Molecular Function  sequence-specific DNA binding
Sequence
Protein Sequence    Length: 527 aa
MDAHGEMGLI GENFDPGLVG RMKEDGYESR SGSDNFEGAS GDDQDAADDG RPKKKKYHRH  60
TPHQIQELES FFKECPHPDE KQRLELSRRL ALESKQIKFW FQNRRTQMKN RWAEMFPCMI  120
SRVATIDVLS SATGVTRDNT LQVMDAEFQV LSPLVPVRQV RFLRFCKQHT ERVWAVVDVS  180
IDASQDAASA QMFPNCRRLP SGCVIQDMDN KYSKVTWVEH SEYDDSAVHH LLRPLLSYGF  240
GFGAHRWLAT LQRQCDCLAV LMSPNIPGEE NTGITPAGRK NMLKLAQRMT YNFCAGVCAS  300
SVHKWDKLSV GNVGEDVRVM TRKNIDDPGE PAGVVLSAAT SVWMPITQQR LFDFLRDERM  360
RSQWDILSNG GPMQGMVKIA KGPGHGNCVS LLRGSAINAN ENNMLILQET WSDASGALVV  420
YAPVDISSIG VVMNGGDSAY VALLPSGFAI LPGISPSYHG GQSNSNGPMV KPDIDGSISG  480
GCLLTVGFQI LVNSLPTAKL TVESVETVNN LISCTIQKIK AALTVT*
Regulation -- PlantRegMap
Source       Upstream Regulator  Target Gene
PlantRegMap  Retrieve            -
Annotation -- Protein
Source     Hit ID           E-value  Description
Refseq     XP_007038624.1   0.0      Homeobox-leucine zipper family protein / lipid-binding START domain-containing protein isoform 4
Swissprot  Q0WV12           0.0      ANL2_ARATH; Homeobox-leucine zipper protein ANTHOCYANINLESS 2
TrEMBL     A0A061FZZ0       0.0      A0A061FZZ0_THECC; Homeobox-leucine zipper family protein / lipid-binding START domain-containing protein isoform 4
STRING     GLYMA09G40130.1  0.0      (Glycine max)
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT4G00730.1  0.0      HD-ZIP family protein
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]